

Docket No.: OT2.P59

which there are p=8 subinstruction fields, there are not more than $2^8 = 256$ situations. Some situations turn out to be the same, so the number of situations is slightly less than 256. To cover every such situation, however, there are 'p' control bits in the set 37 of instruction-compression control bits. Thus, in one embodiment there are 'p' control bits included with each instruction.
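As a rough illustration of the counting argument above, the following sketch enumerates the control-bit patterns for p = 8. It assumes, purely for illustration, that each control bit marks whether the corresponding subinstruction field is shared; this passage does not define the bit semantics. The names `P`, `all_situations` and `shared_fields` are hypothetical.

```python
from itertools import product

P = 8  # number of subinstruction fields (p = 8 in the text)


def all_situations(p: int = P) -> list[tuple[int, ...]]:
    """Return every p-bit control pattern, one bit per subinstruction field."""
    return list(product((0, 1), repeat=p))


def shared_fields(pattern: tuple[int, ...]) -> list[int]:
    """Assumption: a 1 bit marks field i as shared, a 0 bit as stored explicitly."""
    return [i for i, bit in enumerate(pattern) if bit]


situations = all_situations()
print(len(situations))                          # 256, the upper bound 2**8
print(shared_fields((0, 1, 0, 0, 1, 0, 0, 0)))  # [1, 4]
```

The enumeration gives only the upper bound of 256. It does not merge the situations that the text notes turn out to be the same.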

However, as the instruction width increases, it may be undesirable to add so many extra control bits for subinstruction sharing. In particular, the cost of so many bits may seem excessive when a pattern of subinstruction redundancies tends to come up over and over again in practice. As a result, in the preferred embodiments the number of control bits is reduced to less than p to handle a prescribed number of the possible $2^p$ subinstruction sharing situations. Different processors are designed for different applications, in which the pattern of subinstruction redundancies varies. Further, the cases of subinstruction redundancies covered for subinstruction sharing are strategically selected for a given processor to have the greatest impact on those applications for which the processor is targeted (e.g., image processing applications). The preferred embodiments described in the prior sections relate to subinstruction sharing situations that have been found to occur in strategically important tight loops of common image processing functions.
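The reduction from p control bits to fewer can be pictured as a small lookup table of prescribed sharing situations. The sketch below is a minimal illustration under stated assumptions, not the encoding of the preferred embodiments. It assumes a sharing situation is a p-bit integer pattern, that code 0 is reserved for "no sharing", and that the covered situations are chosen by frequency in a profile. The text says only that the covered cases are strategically selected. All function names and the sample profile are hypothetical.

```python
from collections import Counter


def control_bits_needed(num_situations: int) -> int:
    """Bits needed to distinguish a prescribed number of situations."""
    return max(1, (num_situations - 1).bit_length())


def build_table(observed_patterns: list[int], k: int) -> list[int]:
    """Assumption: keep the 2**k - 1 most frequent nonzero patterns.

    Index 0 is reserved for the 'no sharing' pattern.
    """
    counts = Counter(pat for pat in observed_patterns if pat != 0)
    frequent = [pat for pat, _ in counts.most_common(2**k - 1)]
    return [0] + frequent


def encode(pattern: int, table: list[int]) -> int:
    """Return the k-bit code, falling back to 0 (no sharing) if uncovered."""
    return table.index(pattern) if pattern in table else 0


def decode(code: int, table: list[int]) -> int:
    """Recover the p-bit sharing pattern selected by a k-bit code."""
    return table[code]


# Hypothetical profile of sharing patterns seen in a tight loop (p = 8).
profile = [0b00010010, 0b00010010, 0b11000000,
           0b00010010, 0b00000001, 0b11000000]
table = build_table(profile, k=2)

print(control_bits_needed(256))         # 8 bits to cover all 2**8 situations
print(control_bits_needed(len(table)))  # 2 bits for the 4 prescribed situations
print(encode(0b11000000, table))        # 2
print(encode(0b00001111, table))        # 0, pattern not covered by the table
```

In this sketch an uncovered pattern costs nothing in correctness. The instruction is simply encoded without sharing, so only the compression benefit is lost for that case.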

An advantage of the invention is that the required instruction space in an instruction cache is effectively reduced for VLIW instructions. In particular, for some functions that occupy tight loops during image processing algorithms, it is possible to maintain the tight loop without thrashing, where otherwise thrashing would occur.

Another advantage is that by avoiding some redundancies in VLIW subinstructions, fewer instruction bits are needed, and correspondingly program size is reduced. In addition, efficiency of the instruction cache usage is improved, and instruction fetch bandwidth is increased.

Fig. 8 shows a set 36 of control bit groups 38. Each group 38 includes a plurality of bits 40.

Although a preferred embodiment of the invention has been illustrated and described, various alternatives, modifications and equivalents may be used. Therefore, the foregoing description should not be taken as limiting the scope of the inventions which are defined by the appended claims.
